# Efficient Training
## Nanovlm 450M

MIT · Image-to-Text · Safetensors · by lusxvr · 339 downloads · 2 likes

nanoVLM is a lightweight vision-language model (VLM) designed for efficient training and experimentation.
## Nanovlm

MIT · Image-to-Text · Safetensors · by andito · 187 downloads · 1 like

nanoVLM is a lightweight vision-language model (VLM) designed for efficient training and experimentation.
## Qwen2.5 Coder 7B NEP Fix

Apache-2.0 · Large Language Model · Transformers · English · by lurf21 · 20 downloads · 1 like

A text generation and inference model based on Qwen/Qwen2.5-Coder-7B, fine-tuned with the Unsloth and TRL libraries for roughly 2x faster training.
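The listing gives no usage details, but a model fine-tuned with Unsloth can typically be loaded through Unsloth's FastLanguageModel wrapper. A minimal sketch follows; the repo id is a guess derived from the author and model name above, so substitute the actual repository before running.

```python
# Minimal sketch: loading an Unsloth-finetuned Qwen2.5-Coder model for inference.
# The repo id below is an assumption based on the listing, not a verified path.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="lurf21/Qwen2.5-Coder-7B-NEP-Fix",  # hypothetical repo id
    max_seq_length=4096,
    load_in_4bit=True,  # 4-bit loading keeps memory usage low
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference path

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```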
## Bonsai

Large Language Model · Transformers · by deepgrove · 113 downloads · 8 likes

Bonsai is a small ternary-weighted language model with 500 million parameters, built on the Llama architecture and using the Mistral tokenizer, trained on fewer than 5 billion tokens.
## RWKV7 Goose Pile 168M HF

Apache-2.0 · Large Language Model · Transformers · English · by RWKV · 57 downloads · 2 likes

An RWKV-7 model in the Flash Linear Attention format, trained on the Pile dataset and supporting English text generation.
## Traceback 12b

Apache-2.0 · Large Language Model · Transformers · by secemp9 · 1,470 downloads · 29 likes

TraceBack 12b is a 4-bit quantized model based on the Mistral-Nemo-Instruct architecture, focused on instruction following and chain-of-thought reasoning.
## Slam

MIT · Audio Generation · Transformers · by slprl · 115 downloads · 10 likes

A speech language model over discrete HuBERT tokens, focused on efficient training and able to generate continuations of speech segments.
## Open Reasoner Zero 7B

MIT · Large Language Model · Transformers · by Open-Reasoner-Zero · 776 downloads · 28 likes

Open Reasoner Zero is an open-source implementation of large-scale, reasoning-oriented reinforcement learning on base models, emphasizing scalability, simplicity, and ease of use.
## Deepseek R1 Distill Llama 8B Finance V1

Apache-2.0 · Large Language Model · Transformers · English · by abhi9ab · 1,227 downloads · 6 likes

A finance-domain language model fine-tuned from DeepSeek-R1-Distill-Llama-8B using LoRA, suited to financial Q&A and instruction-following tasks.
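Since the card mentions LoRA fine-tuning, the adapter would normally be attached to the base model with PEFT. A minimal sketch, assuming a hypothetical adapter repo id derived from the listing; if the repository instead ships merged weights, a plain `AutoModelForCausalLM.from_pretrained` call on that repo is enough.

```python
# Minimal sketch: applying a LoRA adapter to the DeepSeek-R1-Distill-Llama-8B base model.
# The adapter repo id is an assumption from the listing, not a verified path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
adapter_id = "abhi9ab/DeepSeek-R1-Distill-Llama-8B-Finance-v1"  # hypothetical

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapter

prompt = "Explain the difference between revenue and operating income."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```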
## Llama 3.2 11B Vision Radiology Mini

Apache-2.0 · Text-to-Image · Transformers · English · by mervinpraison · 39 downloads · 2 likes

A vision instruction-tuned model fine-tuned with Unsloth, supporting multimodal (image and text) tasks.
## Llama 3 Instruct 8B SimPO

Large Language Model · Transformers · by princeton-nlp · 1,924 downloads · 58 likes

SimPO is a reference-free preference optimization method that simplifies the traditional RLHF pipeline by optimizing the language model directly on preference data, without a reference model or a separately trained reward model.
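Because SimPO only changes the training objective, the released checkpoint behaves like a standard Llama-3-style causal LM and can be used through the usual transformers chat interface. A minimal sketch; the repo id is inferred from the author and model name in the listing, so verify it before use.

```python
# Minimal sketch: chatting with the SimPO-trained Llama 3 Instruct checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "princeton-nlp/Llama-3-Instruct-8B-SimPO"  # inferred from the listing
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "Summarize what preference optimization does in one sentence."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```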
## Mistral Supra

Apache-2.0 · Large Language Model · PyTorch · English · by TRI-ML · 163 downloads · 12 likes

Mistral-SUPRA is a linear RNN model initialized from Mistral-7B, combining properties of Transformer and recurrent models.
## Moe LLaVA Qwen 1.8B 4e

Apache-2.0 · Text-to-Image · Transformers · by LanguageBind · 176 downloads · 14 likes

MoE-LLaVA is a large vision-language model based on a Mixture-of-Experts architecture, achieving efficient multimodal learning through sparsely activated parameters.
## Is New Dataset Teacher Model

Apache-2.0 · Text Classification · by librarian-bots · 168 downloads · 1 like

A few-shot text classification model built with the SetFit framework, combining contrastive fine-tuning of a sentence embedding model with a lightweight classification head.
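SetFit checkpoints are used through the setfit library rather than raw transformers. A minimal sketch, assuming a hypothetical repo id derived from the author and model name above.

```python
# Minimal sketch: few-shot text classification with a SetFit checkpoint.
# The repo id is an assumption based on the listing, not a verified path.
from setfit import SetFitModel

model = SetFitModel.from_pretrained("librarian-bots/is_new_dataset_teacher_model")  # hypothetical id
preds = model.predict([
    "We introduce a new benchmark dataset for multilingual summarization.",
    "This paper proposes a faster optimizer for transformer training.",
])
print(preds)  # predicted labels, e.g. whether each abstract announces a new dataset
```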
## Godot Dodo 4x 60k Llama 13b

Large Language Model · Transformers · by minosu · 43 downloads · 8 likes

Godot-Dodo is an instruction-following model fine-tuned from LLaMA 13B, specializing in understanding and generating code instructions.
## Pepe

Image Classification · by PeskyAmiable · 0 downloads · 0 likes

A Keras-based image classification model supporting multiple pre-trained architectures, suitable for common image classification tasks.
## Ppo Pendulum V1

Physics Model · by ernestumorga · 16 downloads · 0 likes

A reinforcement learning model trained with the PPO algorithm to solve the control problem posed by the Pendulum-v1 environment.
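PPO on Pendulum-v1 is a standard Stable-Baselines3 setup, so a comparable agent can be trained from scratch in a few lines. This sketch does not load the published checkpoint, and the hyperparameters are illustrative defaults rather than the ones used for this model.

```python
# Minimal sketch: training a PPO agent on Pendulum-v1 with Stable-Baselines3.
# Hyperparameters are illustrative defaults, not the published model's settings.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("Pendulum-v1")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)

# Roll out the trained policy for a few hundred steps.
obs, _ = env.reset()
for _ in range(200):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    if terminated or truncated:
        obs, _ = env.reset()
```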
## Distilbert Dot Tas B B256 Msmarco

Text Embedding · Transformers · English · by sebastian-hofstaetter · 3,188 downloads · 23 likes

A DistilBERT-based dual encoder with dot-product scoring, trained on MSMARCO-Passage with balanced topic-aware sampling, suitable for dense retrieval and re-ranking of candidate sets.
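A dual encoder with dot-product scoring embeds queries and passages independently and ranks passages by the dot product of the two vectors. A minimal sketch with transformers, assuming the repo id below and [CLS]-vector pooling; check the model card for the exact pooling it uses.

```python
# Minimal sketch: dot-product dense retrieval scoring with a DistilBERT dual encoder.
# Repo id and CLS pooling are assumptions; verify against the model card.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco"
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(model_id)

def encode(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = encoder(**batch)
    return out.last_hidden_state[:, 0, :]  # [CLS] vector as the text embedding

query_vec = encode(["what is dense retrieval"])
passage_vecs = encode([
    "Dense retrieval encodes queries and documents into vectors and matches them by similarity.",
    "The Eiffel Tower is located in Paris.",
])
scores = query_vec @ passage_vecs.T  # dot-product relevance scores
print(scores)
```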
## Deit Base Patch16 224

Apache-2.0 · Image Classification · Transformers · by facebook · 152.63k downloads · 13 likes

DeiT (Data-efficient image Transformer) is a Vision Transformer trained with data-efficient training strategies, pretrained and fine-tuned on the ImageNet-1k dataset at 224x224 resolution.
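The checkpoint is a standard transformers image-classification model, so the pipeline API is the quickest way to try it. A minimal sketch; the repo id matches the author and model name in the listing.

```python
# Minimal sketch: ImageNet-1k classification with the DeiT base checkpoint.
from transformers import pipeline

classifier = pipeline("image-classification", model="facebook/deit-base-patch16-224")

# Any local path or URL to an image works here; this URL is only an example.
results = classifier("http://images.cocodataset.org/val2017/000000039769.jpg")
for r in results:
    print(f"{r['label']}: {r['score']:.3f}")
```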
## Bert Mini Finetuned Squadv2

Question Answering System · Transformers · by M-FAC · 17 downloads · 0 likes

This model is based on the BERT-mini architecture and fine-tuned on the SQuAD 2.0 dataset with the M-FAC second-order optimizer for question answering tasks.